Three datasets necessary to reproduce the figures in our SC21 submission titled "Understanding why machine learning models of I/O fail: A taxonomy of I/O throughput modelling errors". The darshan_theta_2017_2020.csv file is a CSV file constructed from Darshan logs, where every row represents an HPC job run on ALCF Theta and each column is a different feature of the job. This data is post-processed to simplify reproduction of the paper. It is also anonymized: the apps_short column holds the anonymized name of the application. The cobalt_theta_2017_2020.csv file contains Cobalt scheduler logs, where UIDs of allocations correspond to Darshan job UIDs. This data is also public, and is not preprocessed, only aggregated o...
Currently, for small-scale machine learning projects, no limit has been set by its res...
The zip archive contains three datasets used during the experimental phase of the paper: ANTICIPA...
Supporting datasets (iid and ood) used in the evaluation experiments of the paper "Why did AI get th...
Data and scripts needed to reproduce the figures in our SC22 submission. See the README.md file to ...
The data certification of the CMS experiment data is an essential process to guarantee high quality ...
Supplementary Information for "Autonomous data extraction from peer reviewed literature for training...
The impact of the Artificial Intelligence revolution is undoubtedly substantial in our society, life...
Three datasets created with information in existing systematic literature review publications. Hall...
Machine learning teaches computers to think in a similar way to how humans do. ML models work by ...
Missing data is an intrinsic problem of broad science and engineering. In the emerging era of big da...
This is the replication package for the article "Machine Learning for the Identification and Classif...
Machine learning (ML) is now commonplace, powering data-driven applications in various organizations...
The main dataset from a year of running the SGRA on the HPC (Summit) environment is divided into t...
A count matrix undergoes pre-processing, including normalization and filtering. The data is randomly...
Datasets available at UCI Machine Learning Repository and other repositories. List of datasets used...